Information

This R Markdown document was created as part of an individual assignment for SMM634 at Bayes Business School, City St George's, University of London in Term 1 2025-26.

Bonus Interactive Dashboard Available

This static report is accompanied by a live R Shiny dashboard allowing one to test different tickers, exclude specific outlier periods, and adjust bootstrap simulation parameters.



1 Introduction

Healthcare expenditure represents a substantial share of economic activity in many high-income countries. In the United States, national health spending is projected to reach 20.3% of Gross Domestic Product (GDP) by 2033, up from 17.6% in 2023 (Keehan et al. 2025). Accurately modelling healthcare utilisation and expenditure is essential for forecasting budgetary pressures, planning service capacity, and designing policies that can meet the needs of an ageing population in an efficient and sustainable way.

The Agency for Healthcare Research and Quality (AHRQ) publishes the Medical Expenditure Panel Survey (MEPS), which provides detailed information on healthcare utilisation, associated expenditures, insurance coverage, and socio-demographic characteristics, all standardised to represent a full calendar year for each respondent (Agency for Healthcare Research and Quality 2023). These data enable one to quantify how healthcare use and costs vary across population groups and to identify factors associated with higher or lower spending, which is valuable for targeting interventions and evaluating potential policy reforms.

From a modelling perspective, healthcare expenditure data pose several distributional challenges: they are non-negative, highly right-skewed (a few patients incur very large costs), and contain a large proportion of individuals with no recorded healthcare usage. Understanding both whether individuals access healthcare and, conditional on doing so, how much is spent is crucial for characterising demand and the resulting financial burden. This report uses 2012 MEPS data to investigate two primary outcomes related to physician services: the number of doctor consultations (dvisit), capturing healthcare utilisation, and the annual expenditures on doctor visits (dvexpend), capturing the associated financial cost.

2 Methods

The 2012 MEPS dataset contains 10,638 observations on US adults aged 18–65. Tables 2.1 & 2.2 show the categorical and quantitative variables for both the full sample and a subsample of individuals with positive expenditure.

The covariates include demographic characteristics (age, gender, ethnicity, region, and education), socioeconomic status (income), and health indicators (BMI, self-reported general and mental health, hypertension, and hyperlipidemia). The number of non-physician visits (ndvisit) was retained as a proxy for the latent propensity to seek care.

Table 2.1: Frequency of categorical variables in the full sample of patients and the subset of patients with positive healthcare expenditure.
Variable Category Full sample (proportion) Positive expenditure (proportion)
gender Female 0.53 0.62
Male 0.47 0.38
ethnicity White 0.70 0.71
Black 0.21 0.20
Native American 0.01 0.01
Others 0.09 0.08
education_cat_detailed Less than High School 0.22 0.18
High School Graduate 0.31 0.29
Some College 0.24 0.25
College Graduate 0.15 0.17
Post-Graduate 0.09 0.11
region Northeast 0.15 0.16
Midwest 0.19 0.21
South 0.39 0.38
West 0.26 0.24
hypertension No 0.75 0.65
Yes 0.25 0.35
hyperlipidemia No 0.77 0.68
Yes 0.23 0.32
Table 2.2: Summary statistics, mean (standard deviation), for continuous variables in the full sample and among individuals with positive healthcare expenditure.
Variable Description Full sample Positive expenditure
bmi Body mass index 28.03 (6.41) 28.68 (6.79)
age Age (years) 40.25 (13.66) 43.32 (13.55)
education Education (no. of years) 12.77 (2.90) 13.13 (2.81)
income Income (USD) 60,817 (51,451) 65,688 (54,115)
dvisit Doctor visits 2.13 (3.63) 3.94 (4.14)
ndvisit Non-doctor visits 0.94 (2.91) 1.51 (3.62)
dvexpend Doctor expenditure 481.07 (1,646.79) 889.71 (2,156.92)
ndvexpend Non-doctor expenditure 159.59 (790.31) 266.51 (1,038.92)

2.1 Health-care expenditure

Doctor-visit expenditure (dvexpend) is semi-continuous, characterised by a point mass at zero (45.9% of observations) and a heavy right skew among positive values. To address this, a two-part model was used:

  1. Extensive margin: A Probit model estimates the probability of incurring any expenditure.
  2. Intensive margin: A Generalised Linear Model (GLM) with a Gamma distribution and log link models the magnitude of costs, conditional on them being positive. Because it models the mean on the original cost scale, this approach avoids the retransformation bias that arises in log-transformed (log-normal) models when residuals are heteroscedastic (Blough, Madden, and Hornbrook 1999; Deb and Norton 2018).

For both parts, the choice of covariates was informed by a preliminary Lasso regression using the full set of candidate predictors, combined with prior domain knowledge. ndvisit was retained a priori in both parts as a proxy for latent propensity to use healthcare services. The final two-part model is shown in Equations (2.1) and (2.2).

\[\begin{align}
\text{Part 1: }& \nonumber \\
\mathbb{P}\big[\text{dvexpend}_i > 0\big] &= \Phi(\eta_{1i}) \nonumber \\
\eta_{1i} &= \gamma_0 + \gamma_1 \text{Age}_i + \gamma_2 \text{Gender}_i + \gamma_3 \text{BMI}_i \nonumber \\
&\quad + \gamma_4 \text{Ethnicity}_i + \gamma_5 \text{Region}_i + \gamma_6 \text{Education}_i \nonumber \\
&\quad + \gamma_7 \text{General}_i + \gamma_8 \text{Mental}_i \nonumber \\
&\quad + \gamma_9 \text{Hypertension}_i + \gamma_{10} \text{Hyperlipidemia}_i \nonumber \\
&\quad + \gamma_{11} \text{Income}_i + \gamma_{12} \text{ndvisit}_i \tag{2.1} \\[10pt]
\text{Part 2: }& \nonumber \\
\mathbb{E}\big[\text{dvexpend}_i &\mid \text{dvexpend}_i > 0\big] = \mu_i \nonumber \\
\ln(\mu_i) &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Gender}_i \nonumber \\
&\quad + \beta_3 \text{Region}_i + \beta_4 \text{Education}_i \nonumber \\
&\quad + \beta_5 \text{General}_i + \beta_6 \text{Mental}_i \nonumber \\
&\quad + \beta_7 \text{Hypertension}_i + \beta_8 \text{Hyperlipidemia}_i \nonumber \\
&\quad + \beta_9 \text{Income}_i + \beta_{10} \text{ndvisit}_i \tag{2.2}
\end{align}\]

where \(\eta_{1i}\) is the linear predictor for the Probit model, \(\gamma_j\) are the coefficients for the Probit, and \(\beta_j\) are the coefficients for the Gamma GLM with a log link.
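The two parts can be combined into an unconditional expected expenditure by multiplying the Probit probability by the Gamma conditional mean. The following is a minimal illustrative sketch in plain Python, not the code used for this report; the function names and the example linear-predictor values are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def two_part_mean(eta1: float, eta2: float) -> float:
    """Unconditional mean: P(Y > 0) * E[Y | Y > 0].

    eta1 is the Probit linear predictor, as in Equation (2.1).
    eta2 is the log-link linear predictor, as in Equation (2.2).
    """
    return std_normal_cdf(eta1) * math.exp(eta2)


# Hypothetical linear-predictor values, chosen only for illustration.
eta1_example = 0.25
eta2_example = 6.5

print("P(any expenditure):", round(std_normal_cdf(eta1_example), 3))
print("Expected expenditure:", round(two_part_mean(eta1_example, eta2_example), 2))
```

Keeping the two components separate makes it easy to see whether a covariate acts mainly through access (the probability) or through intensity (the conditional mean).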

2.2 Health-care utilisation

The count of doctor visits (dvisit) exhibited substantial overdispersion (variance 13.1 versus mean 2.1), which is inconsistent with the equal mean and variance assumed by a standard Poisson model. A Negative Binomial regression was therefore fitted to account for the excess variability, with Lasso screening and prior domain knowledge informing the choice of covariates (Equation (2.3)).

A Poisson regression model with the same covariates as the final specification was also fitted. A formal overdispersion test, comparing the Poisson residual deviance to a \(\chi^2\) distribution with the corresponding residual degrees of freedom, yielded a Pearson dispersion of 4.18 (p < 0.001). The Poisson model also had a higher Akaike Information Criterion (AIC) than the Negative Binomial model (AIC\(_\text{Poisson}\) = 50386 versus AIC\(_\text{NegBin}\) = 37334), further supporting the choice of a Negative Binomial model.
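A dispersion statistic of this kind is conventionally computed as the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below assumes that definition and uses toy values rather than the MEPS data; it is written in plain Python for illustration and is not the report's own code.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom.

    Under a Poisson model Var(Y) = mu, so each squared residual is scaled
    by the fitted mean. Values well above 1 point to overdispersion.
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    resid_df = len(y) - n_params
    if resid_df <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / resid_df


# Toy counts and fitted means (hypothetical, not MEPS data).
y_toy = [0, 0, 1, 2, 9, 0, 3, 12]
mu_toy = [1.5, 2.0, 2.5, 3.0, 4.0, 2.0, 3.5, 5.0]
print(pearson_dispersion(y_toy, mu_toy, n_params=2))
```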

\[\begin{align}
\text{Count model: } \text{dvisit}_i &\sim \text{NegBin}(\mu_i, \kappa) \nonumber \\[4pt]
\log(\mu_i) &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Gender}_i + \beta_3 \text{BMI}_i \nonumber \\
&\quad + \beta_4 \text{General}_i + \beta_5 \text{Mental}_i \nonumber \\
&\quad + \beta_6 \text{Ethnicity}_i + \beta_7 \text{Region}_i \nonumber \\
&\quad + \beta_8 \text{Hypertension}_i + \beta_9 \text{Hyperlipidemia}_i \nonumber \\
&\quad + \beta_{10} \text{Income}_i + \beta_{11} \text{ndvisit}_i \nonumber \\
&\quad + \beta_{12} \text{Education}_i \tag{2.3}
\end{align}\]

To investigate the dependence structure between the frequency of visits and the intensity of expenditure beyond observed covariates, a copula approach was employed following the framework of Marra and Radice (2025b). By Sklar’s Theorem, the joint cumulative distribution function, \(H(y_1,y_2)\), of utilisation (\(Y_1\)) and expenditure (\(Y_2\)) can be modelled by coupling their marginal distributions (\(F_1\) and \(F_2\)) via a copula function \(C\): \[ H(y_1,y_2) = C(F_1(y_1), F_2(y_2); \theta) \] where \(\theta\) is the association parameter. A Gaussian copula was fitted to the residuals of the marginal models for individuals with positive expenditure using the GJRM package in R, and Kendall’s \(\tau\), a measure of rank correlation, was estimated (Marra and Radice 2025a).
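For a Gaussian copula with continuous margins, the association parameter \(\theta\) and Kendall’s \(\tau\) are linked by the standard identity \(\tau = (2/\pi)\arcsin(\theta)\). The sketch below illustrates that mapping in plain Python; it is an assumption that the reported \(\tau\) was obtained in this way, and the function names are hypothetical.

```python
import math


def kendall_tau_from_theta(theta: float) -> float:
    """Kendall's tau implied by a Gaussian copula with correlation theta."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return (2.0 / math.pi) * math.asin(theta)


def theta_from_kendall_tau(tau: float) -> float:
    """Inverse map: Gaussian copula correlation implied by Kendall's tau."""
    if not -1.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [-1, 1]")
    return math.sin(math.pi * tau / 2.0)


# Illustrative correlations only.
for theta in (0.0, 0.5, 0.9):
    print(theta, round(kendall_tau_from_theta(theta), 3))
```

Under this mapping, a \(\tau\) of 0.522 would correspond to a copula correlation of roughly 0.73.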

3 Results

Both the Gamma and Negative Binomial models were fitted with a log link, so their coefficients are interpreted multiplicatively. For a predictor \(x_k\) with coefficient \(\beta_k\), a \(\Delta\)-unit increase in \(x_k\) multiplies the mean (the conditional mean cost in the Gamma model and the expected visit count in the Negative Binomial model) by \[ \exp(\Delta \beta_k), \] and the percent change is given by \[ 100\left(\exp(\Delta \beta_k)-1\right)\%. \] The Probit coefficients in Part 1 are on the latent-index scale and do not share this multiplicative interpretation.
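The conversion from a log-link coefficient to a percent change can be sketched as follows. This is a minimal illustration in plain Python, not the report's own code; the coefficients are copied from Tables 3.1 (Part 2) and 3.2, which are rounded to three decimals, so the resulting percentages are approximate.

```python
import math


def multiplicative_effect(beta: float, delta: float = 1.0) -> float:
    """Factor by which the mean is multiplied for a delta-unit increase."""
    return math.exp(delta * beta)


def percent_change(beta: float, delta: float = 1.0) -> float:
    """Percent change in the mean for a delta-unit increase."""
    return 100.0 * (multiplicative_effect(beta, delta) - 1.0)


# (label, coefficient, delta); coefficients taken from the rounded tables.
examples = [
    ("age, +10 years (Gamma)", 0.009, 10.0),
    ("male vs female (Gamma)", -0.195, 1.0),
    ("poor vs excellent general health (Gamma)", 1.090, 1.0),
    ("ndvisit, +1 visit (Negative Binomial)", 0.101, 1.0),
]

for label, beta, delta in examples:
    print(f"{label}: {percent_change(beta, delta):+.1f}%")
```

With these inputs the four lines print +9.4%, -17.7%, +197.4%, and +10.6% respectively.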

Table 3.1: Estimated coefficients (γ for Part 1, β for Part 2), standard errors, and p-values from the two-part model for doctor-visit expenditure (dvexpend), including a Probit model for the probability of any expenditure (Part 1) and a Gamma GLM with log link for positive expenditure (Part 2). Rows showing a single set of values (bmi and ethnicity) refer to Part 1 only, as these covariates do not enter Part 2.
Columns: Term; Part 1 (probability of any expenditure): Estimate, Std. Error, p-value; Part 2 (expenditure amount): Estimate, Std. Error, p-value
(Intercept) -0.780 0.084 <0.001 5.634 0.129 <0.001
age 0.008 0.001 <0.001 0.009 0.002 <0.001
bmi 0.006 0.002 0.009
income 0.000 0.000 <0.001 0.000 0.000 0.002
ndvisit 0.113 0.008 <0.001 0.070 0.007 <0.001
General Health (ref: Excellent)
Poor 0.761 0.111 <0.001 1.090 0.160 <0.001
Fair 0.538 0.063 <0.001 0.598 0.112 <0.001
Good 0.301 0.046 <0.001 0.306 0.089 <0.001
VGood 0.159 0.041 <0.001 0.023 0.083 0.779
Mental Health (ref: Excellent)
Poor 0.156 0.139 0.259 0.262 0.206 0.203
Fair 0.113 0.071 0.114 0.147 0.119 0.214
Good -0.053 0.044 0.223 0.057 0.081 0.482
VGood -0.034 0.039 0.385 0.087 0.075 0.243
Education (ref: Less than High School)
High School Graduate 0.082 0.037 0.027 0.142 0.078 0.069
Some College 0.241 0.040 <0.001 0.251 0.081 0.002
College Graduate 0.358 0.047 <0.001 0.290 0.092 0.002
Post-Graduate 0.397 0.058 <0.001 0.409 0.107 <0.001
Ethnicity (ref: White)
Black -0.054 0.034 0.113
Native American 0.179 0.149 0.232
Others -0.146 0.048 0.002
Gender (ref: Female)
Male -0.525 0.027 <0.001 -0.195 0.054 <0.001
Hyperlipidemia
Yes 0.445 0.037 <0.001 0.138 0.063 0.029
Hypertension
Yes 0.360 0.036 <0.001 0.097 0.063 0.128
Region (ref: Northeast)
Midwest 0.034 0.045 0.455 -0.017 0.084 0.838
South -0.098 0.040 0.013 -0.192 0.077 0.012
West -0.178 0.043 <0.001 0.006 0.082 0.945
Table 3.2: Estimated coefficients (β), standard errors, and p-values from a Negative Binomial regression model for the number of doctor visits (dvisit).
Term Estimate Std. Error p-value
(Intercept) -0.689 0.092 <0.001
age 0.010 0.001 <0.001
bmi 0.008 0.002 <0.001
income 0.000 0.000 <0.001
ndvisit 0.101 0.004 <0.001
General Health (ref: Excellent)
Poor 1.178 0.095 <0.001
Fair 0.812 0.064 <0.001
Good 0.467 0.050 <0.001
VGood 0.234 0.046 <0.001
Mental Health (ref: Excellent)
Poor 0.588 0.119 <0.001
Fair 0.368 0.068 <0.001
Good 0.059 0.046 0.202
VGood 0.061 0.042 0.147
Education (ref: Less than High School)
High School Graduate 0.095 0.041 0.022
Some College 0.286 0.044 <0.001
College Graduate 0.387 0.051 <0.001
Post-Graduate 0.453 0.060 <0.001
Ethnicity (ref: White)
Black -0.101 0.037 0.007
Native American -0.070 0.158 0.656
Others -0.126 0.053 0.018
Gender (ref: Female)
Male -0.629 0.029 <0.001
Hyperlipidemia
Yes 0.369 0.037 <0.001
Hypertension
Yes 0.318 0.037 <0.001
Region (ref: Northeast)
Midwest 0.031 0.047 0.516
South -0.133 0.043 0.002
West -0.193 0.046 <0.001

3.1 Model 1: Health care expenditure

The two-part model captures the dual processes of access and cost. In the first stage (Probit), females, older adults, and those with chronic conditions (hypertension, hyperlipidemia) were significantly more likely to incur expenditure (Table 3.1). In the second stage (Gamma GLM), conditional on incurring any expenditure, significant disparities emerged. A 10-year increase in age is associated with an approximately 9% increase in conditional costs, and males incur approximately 17.7% lower costs than females (p < 0.001). This finding aligns with work by Bertakis et al. (2000), who suggest that women have higher healthcare engagement and diagnostic usage during reproductive years and for preventive screenings.

Self-reported general health showed a strong effect: individuals who report “Poor” general health incur approximately 197% higher costs than those in “Excellent” health. By contrast, none of the mental health categories reached statistical significance in either part of the model (all p > 0.1). The general health pattern is consistent with clinical expectations: individuals who are sicker and/or more health-engaged both enter the system more often and incur higher costs once they do so.

The two-part model achieves a Root Mean Square Error of $1,599 and a Mean Absolute Error of $553. The predictive performance was robust at the aggregate level, with the total predicted cost within 1.8% of the actual total, although the model underestimates extreme outliers (Figure 3.1).
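The three accuracy summaries quoted above can be computed as in the sketch below. It is written in plain Python with hypothetical toy costs, not the MEPS predictions, and the function names are illustrative. The toy data include one extreme value to show how a single large under-prediction can dominate the RMSE while the aggregate total stays comparatively close.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def aggregate_pct_error(actual, predicted):
    """Signed gap between total predicted and total actual, as % of actual."""
    return 100.0 * (sum(predicted) - sum(actual)) / sum(actual)


# Hypothetical annual costs in dollars, including one extreme value.
actual_toy = [0.0, 120.0, 450.0, 0.0, 9000.0]
predicted_toy = [150.0, 300.0, 600.0, 200.0, 2500.0]

print("RMSE:", round(rmse(actual_toy, predicted_toy), 1))
print("MAE:", round(mae(actual_toy, predicted_toy), 1))
print("Aggregate % error:", round(aggregate_pct_error(actual_toy, predicted_toy), 1))
```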

Figure 3.1: Distribution of prediction errors (Actual Cost − Predicted Cost) by expenditure category. The boxplots display the median and interquartile range of errors for patients grouped by their actual healthcare expenditure and the red dashed line represents a perfect prediction. The large positive errors for the highest cost categories show that the model fails to account for the full magnitude of catastrophic health expenditures.

3.2 Model 2: Health care utilisation

As shown in Table 3.2, utilisation patterns mirrored those for expenditure. Black and “Other” ethnic groups showed significantly lower visit counts compared to White patients (Incidence Rate Ratios of approximately 0.90 and 0.88, respectively), consistent with structural barriers to access and utilisation documented in the US healthcare system (Waidmann and Rajan 2000; Macias-Konstantopoulos et al. 2023). Non-physician visits (ndvisit) were positively associated with doctor visits (approximately 11% increase per visit), suggesting complementarity rather than substitution between provider types.
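Incidence Rate Ratios and approximate 95% intervals can be recovered from the estimates and standard errors in Table 3.2. The sketch below assumes a Wald (normal-approximation) interval on the log scale, which may differ from the interval method used elsewhere; it is plain Python for illustration, with a hypothetical function name, and the inputs are the rounded table values.

```python
import math


def irr_with_wald_ci(beta: float, se: float, z: float = 1.96):
    """Return (IRR, lower, upper) from a log-scale estimate and its SE.

    The interval is exp(beta -/+ z * se), a normal approximation.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# (estimate, standard error) copied from the rounded values in Table 3.2.
rows = {
    "Black": (-0.101, 0.037),
    "Others": (-0.126, 0.053),
    "ndvisit": (0.101, 0.004),
}

for term, (beta, se) in rows.items():
    irr, lower, upper = irr_with_wald_ci(beta, se)
    print(f"{term}: IRR {irr:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```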

3.3 Dependence structure

The copula analysis of positive spenders revealed a moderate-to-strong positive dependence between frequency and severity. This implies that patients who have more doctor’s appointments than their covariates would predict also tend to incur higher-than-expected expenditure, suggesting a compounding resource burden for patients with higher health care demands.

As shown in Figure 3.2, there is a clear positive association between the number of visits and total expenditure. While some mechanical correlation is expected (more visits naturally mean more cost), the Kendall’s \(\tau\) obtained from the fitted Gaussian copula (\(\tau =\) 0.522) indicates a dependence that extends beyond simple accumulation. This finding is in agreement with recent work by Marra and Radice (2025b), who used the same MEPS data and also found that ‘heavy’ users are distinct not only in their visit frequency but also in their resource consumption intensity. This supports the use of a joint frequency-severity modelling framework over independent models.

Figure 3.2: Joint distribution of healthcare utilisation and expenditure for patients with at least one doctor’s visit. The vertical banding reflects the discrete nature of visit counts and the y-axis shows total doctor-visit expenditure (dvexpend) on a logarithmic scale to accommodate skewness. The overlaying positive trend highlights that expenditure increases with visit frequency. The model-estimated Kendall’s \(\tau\) quantifies the residual dependence, confirming that frequency and severity are positively correlated even after adjusting for covariates.

4 Discussion and conclusion

This analysis highlights the complex drivers of healthcare demand. The two-part expenditure model confirms that the decision to seek care and the resulting cost intensity are driven by largely overlapping covariates whose effects differ in magnitude. The significant gender and health-status gaps reinforce the need for risk-adjustment models that explicitly account for biological and systemic usage differences (Bertakis et al. 2000).

The utilisation analysis identified significant ethnic disparities in visit frequency. Even after controlling for income (a potential insurance proxy) and self-assessed health status, minority groups accessed physicians less frequently. These findings align with a review by Macias-Konstantopoulos et al. (2023) identifying structural racism and implicit bias as fundamental drivers of health inequities, where systemic factors limit access and quality of care independent of clinical need. However, that review was carried out under a critical review framework designed to ‘advance conceptual innovation’ rather than exhaustively assess the literature, and its findings should be interpreted within this context.

A key limitation of this study is the reliance on observational cross-sectional data, which precludes causal inference. While ndvisit was used as a proxy for care-seeking propensity, it may introduce endogeneity if unobserved health shocks drive both ndvisit and the outcome variables simultaneously. Furthermore, while the Gamma and Negative Binomial models handled skewness and overdispersion well, the diagnostic plots (Appendix A) show some strain in capturing the most extreme right-tail outliers, a common challenge in medical econometrics.

The copula analysis adds a critical dimension: frequency and severity are not independent. The positive Kendall’s \(\tau\) suggests that high-frequency patients are distinct not just in volume, but in the complexity of resources consumed per visit. Future frequency-severity modelling should therefore avoid independence assumptions to prevent the underestimation of aggregate risk for high-cost patient cohorts.

5 Strengths and Limitations

A key strength of this analysis is the alignment between the data structure and the chosen models. The two-part framework explicitly separates the extensive margin (any use) from the intensive margin (conditional costs), while the Negative Binomial model accommodates overdispersed visit counts. Variable selection was guided by Lasso screening, which helps to identify the most predictive covariates and reduce overfitting, before refitting unpenalised GLMs for interpretation. The models were estimated on a large, nationally representative dataset, and predictive checks suggest good calibration at the population level, with total predicted expenditure closely matching the observed total.

However, several limitations should be noted. First, the models describe associations, not causal effects. The observational nature of MEPS means that unobserved factors—such as underlying disease severity, insurance plan details, or provider characteristics—may confound the relationships between covariates and utilisation or cost. For example, ndvisit is interpreted as a proxy for latent propensity to seek care, but it may also be affected by the same unobserved health shocks that drive doctor visits, introducing simultaneity. Second, the linear predictors assume additive effects on the link scale and do not capture potential non-linearities or interactions (e.g. age by comorbidity); future work could relax this using splines or flexible machine-learning models such as tree-based methods. Third, despite the use of Gamma and Negative Binomial families, the models still struggle to reproduce the extreme right tail of the cost distribution, as evidenced by the large positive errors among the highest-cost patients.

Finally, although residuals and diagnostic plots do not reveal gross mis-specification, the usual GLM assumptions apply at the level of the conditional distribution: independence of observations, correct specification of the mean–variance relationship, and inclusion of the most important predictors. Violations of these assumptions, together with potential omitted variable bias, limit the extent to which conclusions can be generalised or interpreted in a causal sense. Nevertheless, the models provide a coherent descriptive summary of how doctor-visit utilisation and expenditure vary with observable demographic, socioeconomic, and health characteristics.

Appendices

A Model Diagnostics

A.1 Expenditure Model (Gamma GLM)

The Gamma model with log link handles the skewness of positive expenditures. The residual plots below show no severe deviations, though some ‘strain’ is visible at the highest fitted values.

Figure A.1: Diagnostic plots for the Gamma GLM (Expenditure).

  • The Q-Q plot shows points deviating sharply upwards from the diagonal line at the upper quantiles. This confirms the model’s limitation in capturing the most extreme catastrophic costs.
  • The Residuals vs Fitted plot displays a fanning pattern where spread increases with fitted values. This is expected for a Gamma specification, where the variance is proportional to the square of the mean.
  • Influential Observations: Cook’s distance values are below 0.5, indicating that no single high-cost patient is disproportionately driving the model parameters.

Figure A.2: Deviance residuals versus fitted values for the Gamma GLM of conditional doctor-visit expenditure. Each point represents a respondent and the dashed line indicates a residual value of zero.

A.2 Utilisation Model (Negative Binomial)

The Negative Binomial model accounts for overdispersion in visit counts. The diagnostics indicate a reasonable fit for the count data structure.

Figure A.3: Diagnostic plots for the Negative Binomial (Utilisation).

  • The Q-Q plot follows the reference line closely, including in the tails, showing that the Negative Binomial distribution successfully accounts for the overdispersion in the visit count data.
  • While there are some high-frequency users, Cook’s distance remains low, suggesting the model is robust to these ‘extreme’ users.

Figure A.4: Deviance residuals versus fitted values for the Negative Binomial regression of doctor-visit counts.

Deviance residuals (utilisation) show no strong systematic pattern, with most observations clustered near zero for low fitted visit counts. Variability increases slightly with the fitted mean and a few large positive residuals appear among the highest predicted counts, indicating some under-prediction for the heaviest users but no major violation of the Negative Binomial mean–variance structure.

B Reproducibility, accessibility & declarations (Gen-AI & word count)

B.1 Reproducibility & accessibility

An accessible HTML version of this report is available via a public GitHub page:

This report was created in R Markdown. The source code is open source and is available via:


Agency for Healthcare Research and Quality. 2023. “Medical Expenditure Panel Survey: Download Data Files, Documentation, and Codebooks.” U.S. Department of Health and Human Services. https://meps.ahrq.gov/mepsweb/data_stats/download_data_files.jsp.
Bertakis, K. D., R. Azari, L. J. Helms, E. J. Callahan, and J. A. Robbins. 2000. “Gender Differences in the Utilization of Health Care Services.” The Journal of Family Practice 49 (2): 147–52.
Blough, David K., Carolyn W. Madden, and Mark C. Hornbrook. 1999. “Modeling Risk Using Generalized Linear Models.” Journal of Health Economics 18 (2): 153–71. https://doi.org/10.1016/S0167-6296(98)00032-0.
Deb, Partha, and Edward C. Norton. 2018. “Modeling Health Care Expenditures and Use.” Annual Review of Public Health 39 (1): 489–505. https://doi.org/10.1146/annurev-publhealth-040617-013517.
Keehan, Sean P., Andrew J. Madison, John A. Poisal, Gigi A. Cuckler, Sheila D. Smith, Andrea M. Sisko, Jacqueline A. Fiore, and Kathryn E. Rennie. 2025. “National Health Expenditure Projections, 2024-33: Despite Insurance Coverage Declines, Health to Grow as Share of GDP.” Health Affairs 44 (7): 776–87. https://doi.org/10.1377/hlthaff.2025.00545.
Macias-Konstantopoulos, Wendy L., Kimberly A. Collins, Rosemarie Diaz, Herbert C. Duber, Courtney D. Edwards, Anthony P. Hsu, Megan L. Ranney, Ralph J. Riviello, Zachary S. Wettstein, and Carolyn J. Sachs. 2023. “Race, Healthcare, and Health Disparities: A Critical Review and Recommendations for Advancing Health Equity.” Western Journal of Emergency Medicine: Integrating Emergency Care with Population Health 24 (5). https://doi.org/10.5811/westjem.58408.
Marra, Giampiero, and Rosalba Radice. 2025a. Copula Additive Distributional Regression Using R. 1st ed. New York: Chapman and Hall/CRC. https://doi.org/10.1201/9781003593195.
———. 2025b. “Modelling Physician Visit Frequency and Costs Using a Copula Additive Distributional Regression Approach.” Journal of the Royal Statistical Society Series C: Applied Statistics, September, qlaf050. https://doi.org/10.1093/jrsssc/qlaf050.
Waidmann, Timothy A., and Shruti Rajan. 2000. “Race and Ethnic Disparities in Health Care Access and Utilization: An Examination of State Variation.” Medical Care Research and Review 57 (1_suppl): 55–84. https://doi.org/10.1177/1077558700057001S04.